Investigating the Use of Chronological Splitting to Compare Software Cross-company and Single-company Effort Predictions
Abstract
CONTEXT: Numerous studies have investigated the use of cross-company datasets to estimate effort for single-company projects; however, to date only one has compared the effect of using a chronological split instead of a random split to assign projects to a training set and a validation set, finding no significant differences. OBJECTIVE: The aim of this study is to extend [15] using a project-by-project chronological split, and also to investigate how this type of split affects the results when comparing within- to cross-company effort estimation. METHOD: Chronological splitting was compared with two forms of cross-validation. Here a more realistic form of chronological splitting than the one used in [15] is investigated, in which a validation set contains a single project, and a regression model is built from scratch using as training set the set of projects completed before the validation project’s start date. We used 228 single-company projects and 678 cross-company projects from the ISBSG Release 10 repository. RESULTS: We obtained contradictory results when comparing cross- to single-company predictions for single-company projects. First, when results were compared using absolute residuals, there were no differences between cross- and single-company predictions, or between techniques. However, when using z values, chronological splitting favoured cross-company models, and cross-validation (both types) favoured single-company models. CONCLUSIONS: Results were promising when using project-by-project splitting because: i) they favoured cross-company models; and ii) this type of splitting mimics an effort estimation scenario in a real environment. However, these results were obtained using z values only. Therefore we urge future studies comparing prediction models to document results obtained using both z values and absolute residuals, so that a full picture can be provided.
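The project-by-project chronological split described under METHOD can be illustrated with a minimal sketch. It is not the authors' implementation. The `Project` fields, the single-predictor log-linear regression, the `min_train` threshold, and the definition of z as predicted effort divided by actual effort are all assumptions made for illustration; the abstract does not specify the regression form or define z.

```python
from dataclasses import dataclass
from datetime import date
from math import exp, log


@dataclass
class Project:
    name: str
    start: date
    end: date
    size: float    # hypothetical size measure, e.g. function points
    effort: float  # hypothetical actual effort, e.g. person-hours


def fit_log_linear(train):
    """Least-squares fit of log(effort) = a + b * log(size) (assumed model form)."""
    xs = [log(p.size) for p in train]
    ys = [log(p.effort) for p in train]
    n = len(train)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All training sizes are equal: fall back to the mean log-effort.
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b


def chronological_split(projects, min_train=2):
    """Treat each project in turn as a one-project validation set.

    The training set holds only projects completed before the validation
    project's start date, and the model is rebuilt from scratch each time.
    """
    results = []
    for target in projects:
        train = [p for p in projects if p.end < target.start]
        if len(train) < min_train:
            continue  # too little history to build a model
        a, b = fit_log_linear(train)
        predicted = exp(a + b * log(target.size))
        results.append({
            "name": target.name,
            "n_train": len(train),
            "predicted": predicted,
            "abs_residual": abs(target.effort - predicted),
            "z": predicted / target.effort,  # assumed definition of z
        })
    return results
```

In this sketch the training pool and the validation projects come from the same list. A cross-company variant would draw the training pool from a separate list of cross-company projects while applying the same date filter.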
Similar resources
Investigating the Use of Chronological Splitting to Compare Software Cross-company and Single-company Effort Predictions: A Replicated Study
CONTEXT: Three previous studies have investigated the use of chronological split to compare cross- to single-company effort predictions, where all used the ISBSG dataset release 10. Therefore there is a need for these studies to be replicated using different datasets such that the patterns previously observed can be compared and contrasted, and a better understanding with regard to the use of chr...
Using Chronological Splitting to Compare Cross- and Single-company Effort Models: Further Investigation
Numerous studies have used historical datasets to build and validate models for estimating software development effort. Very few used a chronological split (where projects’ end dates are used so that training sets only contain projects that were completed before the start date of each project in the validation set), and only one compared chronological split to random split. Therefore the aim of...
Fuzzy Queuing Approach for Designing Multi Supplier Systems (Case: SAPCO Company)
The importance of reliable supply is increasing with supply chain network extension and just-in-time (JIT) production. Just-in-time implications motivate manufacturers towards single sourcing, which often involves problems with unreliable suppliers. If a single and reliable vendor is not available, the manufacturer can split the order among the vendors in order to simultaneously decrease the supp...
Investigating the Risk of Paying Loans to Public and Private Companies Using the Logit Model and Comparing it with Altman Z (Case Study: A Private Bank in Iran)
The design of a credit risk measurement model in the monetary and banking system will play an important role in increasing the profitability of banking resources. This article attempts to use two models of Logit and Z Altman to determine and predict the credit risk of facilities provided to legal entities at a private bank in Iran. The variables studied in this research include qualitative vari...
Predicting Web Development Effort Using a Bayesian Network
OBJECTIVE – The objective of this paper is to investigate the use of a Bayesian Network (BN) for Web effort estimation. METHOD – We built a BN automatically using the HUGIN tool and data on 120 Web projects from the Tukutuku database. In addition the BN model and node probability tables were also validated by a Web project manager from a well-established Web company in Rio de Janeiro (Brazil). ...